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ABSTRACT 

Aims. We report the discovery of a planet with a high planet-to-star mass ratio in the microlensing event MOA-2009-BLG-387, which 
exhibited pronounced deviations over a 12-day interval, one of the longest for any planetary event. The host is an M dwarf, with a mass in the 
range 0.07 Mq < Mhost < 0.49 Mq at 90% confidence. The planet-star mass ratio q = 0.0132 ± 0.003 has been measured extremely well, so at 
the best-estimated host mass, the planet mass is m,, = 2.6 Jupiter masses for the median host mass, M = 0.19 Mq. 

Methods. The host mass is determined from two "higher order" microlensing parameters. One of these, the angular Einstein radius 
6e = 0.31 ± 0.03 mas has been accurately measured, but the other (the microlens parallax tte, which is due to the Earth's orbital motion) is 
highly degenerate with the orbital motion of the planet. We statistically resolve the degeneracy between Earth and planet orbital effects by 
imposing priors from a Galactic model that specifies the positions and velocities of lenses and sources and a Kepler model of orbits. 
Results. The 90% confidence intervals for the distance, semi-major axis, and period of the planet are 3.5 kpc < Dt < 7.9 kpc, 
1.1 AU < a <1.1 AU, and 3.8 yi < P < 7.6 yr, respectively. 
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1. Introduction 

Over the past decade, the gravitational mi crolensing method 
has led to detection of ten exoplanets ( Bond et aP | 2004i 

^ '""35; 'Beaulieu et al.' '2006; 'Gould et al.' '2006"; 

2008: Bennett etal. 200 8: Dong et al. 20Q9b; 
" 2O10HSumietalJl2O10h . which permits the ex- 
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ploration of host-star and planet populations whose mass and 
distance are not probed by any other method. Indeed, since the 
efficiency of the microlensing method does not depend on de- 
tecting light from the host star, it allows one to probe essentially 
all stellar types over distant regions of our Galaxy. In particular, 
microlensing is an excellent method to explore planets around 
M dwarfs, which are the most common stars in our Galaxy, 
but which are often a challenge for other techniques because of 
their low luminosity. Roughly half of all microlensing events 
toward the Galactic bulge stem from stars with mass ^ 0.5 Mq 
JGouM 2000). 

Determining the characteristics and frequency of planets 
orbiting M dwarfs is of interest not only because M dwarfs are 
the most common type of stars in the Galaxy, but also because 
these systems provide important tests of planet formation the- 
ories. In particular, the core accretion theory of giant planet 
formation predicts that giant planets should be less common 
around low-mass stars dLaughhn et al.ll2004t llda"& Linll2005 



Kennedv & K envon"2008l: lD'Angelo et alj|20ld ). whereas the 
gravitational instability model predicts that giant planets can 
form around M dwarfs with sufficiently massive protoplan- 
etary disks (Boss '2006'). In fact, there is accumulating evi- 
dence from radial velocity surveys tha t giant planets are less 



common around lo w-mass primaries (ICumming et al.l 12008 



Johnson et al [2010'). however, these surveys are only sensitive 
to planets wiffi semimajor axes of < 2.5 AU. Since it is thought 
that ffie majority of the giant planets found by radial velocity 
surveys likely formed farther out in their protoplanetary disks 
and subsequently migrated close to their parent star, it is not 
clear whether the relative paucity of giant planets around low- 
mass stars found in these surveys is a statement about the de- 
pendence on stellar mass of migration or of formation. 

Microlensing is complementary to the radial velocity tech- 
nique in that it is sensitive to planets with larger semimajor 
axes, closer to their supposed birth sites. Indeed, based on the 
analysis of 13 wel l-monitored high-m agnification events with 
6 detected planets, Gould et al. ( 2010l) found that the frequency 
of giant planets at separations of ~ 2.5 AU orbiting ~ 0.5 M© 
hosts was quite high and, in particular, consistent with ffie ex- 
trapolation of the frequencies of small-separation giant planets 
orbiting solar mass hosts inferred from radial velocity surveys 
out to the separations where microlensing is most sensitive. 
This suggests that low-mass stars may form giant planets as 
efficiently as do higher mass stars, but that these planets do not 
migrate as efficiently. 

Furthermore, of the ten previously published microlensing 
planets, one was a "supermassive" planet with a very high mass 
ratio: a nip - 3.8Mjup planet or biting an M dwarf of mass 



ffiis planet may pose a challenge to such theories. Gravitational 
instability, on the offier hand, favors the formation of massive 
planets (provided they form at all). 



Current and future microlensing surveys are particularly 
sensitive to large q planets orbiting M dwarf hosts, for several 
reasons. As with other techniques, microlensing is more sen- 
sitive to planets with higher q. In addition, as the mass ratio 
increases, a larger fraction of systems induce an important sub- 
class of resonant-caustic lenses. Resonant caustics are created 
when the planet happens to have a projected separation close 
to the Einstein radius of the primary (Wambsganss 1997). The 
range of separations that give rise to resonant caustics is quite 
narrow for small q, but grows as q^^^. Furthermore, although 
the range of parameter space giving rise to resonant caustics 
is narrow, the caustics themselves and their cross sections are 
large and also grow as q^^^. Thus the probability of detecting 
planets via these caustics is relatively high, and such systems 
contribute a significant fraction of all detected events, particu- 
larly for supermassive planets orbiting M dwarfs. Events due 
to resonant caustics are particularly valuable, as they allow one 
to further constrain the properties and orbit of the planet. This 
is because these events usually exhibit caustic features that are 
separated well in time. When combined with the fact that the 
precise shape of a resonant caustic is extremely sensitive to 
the separation of the planet from the Einstein ring, such light 
curves ar e particularly sensiti ve to orbital motion of the planet 
(see, e.g. jBennett et alJl2010l) . 



M = 0.46 Mq dDong et alJl2009al) . Given their high planet-to- 
star mass ratios q, such planets are expected to be exceedingly 
rare in ffie core-accretion paradigm, so the mere existence of 



Here we present the analysis of the microlensing event 
MOA-2009-BLG-387, a resonant-caustic event, which we 
demonstrate is caused by a massive planet orbiting an M dwarf. 
The light curve associated with this event contains very promi- 
nent caustic features that are well separated in time. These 
structures were very intensively monitored by the microlens- 
ing observers, so that the geometry of the system is quite well 
constrained. As a result, the event has high sensitivity to two 
higher order effects: parallax and orbital motion of the planet. 
In Section 4, we present the modeling of these two effects and 
our estimates of the event characteristics. This analysis reveals 
a degeneracy between one component of the parallax and one 
component of the orbital motion. We explain, for the first time, 
the causes of this degeneracy. It gives rise to very large errors 
in both the parallax and orbital motion, which makes the final 
results highly sensitive to the adopted priors. In particular, uni- 
form priors in microlensing variables imply essentially uniform 
priors in lens-source relative parallax, whereas the proper prior 
for physical location is uniformity in volume element. These 
differ by approximately a factor D*, where D/ is the lens dis- 
tance. In Section 5, we therefore give a careful Bayesian anal- 
ysis that properly weights the distribution by correct physical 
priors. The high-mass end of the range still permitted is elim- 
inated by the failure to detect flux from the lens using high- 
resolution NACO images on the VLT. Combining all available 
information, we find that the host is an M dwarf in the mass 
range 0.07 Mq < Mhost < 0.49 Mq at 90% confidence. 
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2. Observational data 

The microlensing event MOA-2009-BLG-387 was alerted 
by the MOA collaboration (Microlensing Observations in 
Astrophysics) on 24 July 2009 at 15:08 UT, HJD' = HJD - 
2,450,000 = 5037.13, a few days before the first caustic en- 
try. Many observatories obtained data of the event. The ce- 
lestial coordinates of the event are a = 17/!53m50.795 and 
6 = -33°59'25" (J2000.0) coiTesponding to Galactic coordi- 
nates : I = -H356.56, b = -4.097. 

The lightcurve is overall characterized by two pairs of caus - 
tic crossings (entrance plus exit), which together span 12 days 
(see Figur^D. This structure is caused by the source passing 
over two "prongs" of a resonant caustic (see Figure(T] inset). 
Obtaining good coverage of these caustic crossings posed a va- 
riety of challenges. 

The first caustic entrance {HJD' - 5040.3) was de- 
tected by the PLANET collaboration using the South African 
Astronomical Observatory (SAAO) at Sutherland (Elizabeth 
Im) who then issued an anomaly alert at HJD' 5040.4 call- 
ing for intensive follow-up observations, which in turn enabled 
excellent coverage of the first caustic exit roughly one day later 

The second caustic entrance occurred about seven days 
later {HJD' - 5047.1, see FigureO. That the caustic crossings 
are so far apart in time is quite unusual in planetary microlens- 
ing events. Since round-the-clock intensive observations can- 
not normally be sustained for a week, accurate real-time predic- 
tion of the second caustic entrance was important for obtaining 
intensive coverage of this feature. In fact, the second caustic 
entrance was predicted 14 hours in advance, with a five-hour 
discrete uncertainty due to the well-known close/wide s <-> s"' 
degeneracy, where s is the projected separation in units of the 
Einstein radius. The close-geometry crossing prediction was 
accurate to less than one half hour and the caustic-geometry 
prediction was almost identical to the one derived from the best 
fit to the full lightcurve, which is shown in Figure[T] 

The extended duration of the lightcurve anomalies indicates 
a correspondingly large caustic structure. Indeed, the prelimi- 
nary models found a planet/star separation (in units of Einstein 
radius) close to unity, which means that the caustic is resonant 
(see the caustic shape in the upper panel of Figur^I] where the 
source is going upward). 

The event was alerted and monitored by the MOA col- 
laboration. It was also monitored by the Pro bing Lensing 
Anomalies Network collaboration (PLANET; lAlbrow et al- 
ii 998 ) from three different telescopes: at the South African 
Astronomical Observatory (SAAO), as mentioned above, as 
well as the Canopus 1 m at Hobart (Tasmania) and the 60 cm 
of Perth Observatory (Australia). 



The Microlensing Follow Up Network (//FUN ; lYoo et al 



2004 followed the event from Chile (1 .3m SMARTS telescope 
at CTIO) (y, / and H band data). South Africa (0.35 m tele- 
scope at Bronberg observatory). New Zealand (0.40 m and 0.35 
m telescopes at Auckland Observatory (AO) and Farm Cove 
(FCO) observatory, respectively, the Wise observatory (1.0 m 
at Mitzpe Ramon, Israel), and the Kumeu observatory (0.36 m 
telescope at Auckland, NZ). 



The RoboNet collaboration also followed the event with 
their three 2m robotic telescopes : the Faulkes Telescopes 
North (FTN) and South (FTS) in Hawaii and Australia 
(Siding Springs Observatory) respectively, and the Liverpool 
Telescope (LT) on La Palma (Canary Islands). And finally, the 
MiNDSTEp collaboration observed the event with the Danish 
1.54 m at ESO La Silla (Chile). 

Observational conditions for this event were unusually 
challenging, due in part to the faintness of the target and the 
presence of a bright neighboring star. Moreover, the full moon 
passed close to the source near the second caustic entrance. 
As a result, several data sets were of much lower statistical 
quality and had much stronger systematics than the others. We 
therefore selected seven data sets that cover the caustic features 
and the entire lightcurve : MOA, SAAO, FCO, AO, Danish, 
Bronberg, and Wise. They include 118 MOA data points in 
/ band, 221 PLANET data points in / band, 262 juFUN data 
points in unfiltered, R and / bands, and 300 MiNDSTEp data 
points in / band. We also fit the yuFUN CTIO / and V data 
to the final model, but solely for the purpose of determining 
the source size. And finally, we fit //FUN CTIO //-band data 
to the lightcurve in order to compare the //-band source flux 
with the late-time //-band baseline flux from VLT images (see 
Section O)- The SAAO, FCO, AO, Danish, Bronberg, and 
Wise data wer e reduced by MDA using the PYSIS3 software 
jAlbrow et al.l E009(). The FCO, AO, Bronberg, and Wise im- 
ages were taken in white light and suffered from systematic 
effects related to the airmass. Such effects were corrected by 
extracting lightcurves of other stars in the field with similar 
colors to the lens, and assuming that these stars are intrinsically 
constant. 

For each data set, the errors were rescaled to make per 
degree of freedom for the best binary-lens fit close to unity. 
We then eliminated the largest outlier and repeated the process 
until there were no 3 cr outhers. 



2.1. VLT NACO Images 

On 7 June 2010, we obtained high-resolution //-band im- 
ages using the NACO imager on the Very Large Telescope 
(VLT). Since this was approximately 7.7 Einstein timescales 
after the peak of the event, the source was essentially at 
the baseline. The reduction procedures were similar to those 
of MOA-2008-BLG -310, which are described in detail by 
Janczak et all(l2010l) . 



To identify the source on the NACO frame, we first per- 
formed image subtraction on CTIO /-band images to locate its 
position on the /-band frame. We then used the NACO image 
to find relatively unblended stars that could be used to align the 
/-band and NACO frames. There is clearly a source at the in- 
ferred position, but it lies only seven pixels (0.19") from an am- 
bient star, which is 1.35 mag brighter than the "target" (source 
plus lens plus any other blended light within the aperture). This 
proximity induces a 94% correlation coefficient between the 
photometric measurements of the two stars. We therefore esti- 
mate the target error as 0.06 mag. In the NACO system (which 
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is calibrated to 2MASS using comparison stars) the target mag- 
nitude is 

^target,NACO — 

18.25 + 0.06. (1) 

We have an //-band light curve (taken simultaneously with 
V and / at CTIO), and so (once we have established a model fit 
the light curve in Section HJ we can measure quite precisely the 
source flux in the CTIO system, //somce.CTio = 20.03 ± 0.02. To 
compare with NACO, we transform to the NACO system using 
4 comparison stars that are relatively unblended, a process to 
which we assign a 0.03 mag error, finding 



^^source.NACO - 18.35 + 0.03. 



(2) 



The difference, consisting of light from the lens as well as any 
other blended light in the aperture, is 0.10 + 0.07. 

This excess-flux measurement could in principle be due to 
five physical effects. First, it is reasonably consistent with nor- 
mal statistical noise. Second, it could come from the lens. As 
we show in Section |5] this would be consistent with a broad 
range of M dwarf lenses. Third, it could be a companion to the 
source, and fourth, a companion to the lens. Finally, it could be 
an ambient star unrelated to the event. The fundamental impor- 
tance of this measurement is that, for all five of these possibil- 
ities, the measurement places an upper limit on the flux from 
the lens, hence its mass (assuming it is not a white dwarf). 

3. Source properties from color-magnitude 
diagram and measurement of 9e 

To determine the dereddened color and magnitude of the mi- 
crolensed source, we put the best fit color and magnitude of 
the source on an (/, V - I) instrumental color magnitude di- 
agram (CMD) (cf. Fig|2]i, using instrumental CTIO data. The 
magnitude and color of the target are / - 20.62 + 0.04 and 
{V - 1) = -0.42 + 0.01. The mean position of the red clump is 
represented by an open circle at (/, V - I)rc - (16.36, -0.16), 
with an error of 0.05 for both quantities. 

For the absolut e clump magnitude , we adopt Mine = 

. We adopt the mea- 
1.08 ± 0.05 (Fig. 5 



0.25 ± 0.05 from ISennett et all (I2OI0I) 



su red bulge clump co lor (V - /)o,i?c 



of Bensbv et a 



1l2010h an d a G alactocentric distance Rq 



8.0 ± 0.3 kpc ( lYeldaetal]|2010b . We further assume that at 



the longitude (/ = -3.4), the bar lies 0.7 kpc more distant 
than /?() (D. Nataf et al., in preparation), i.e., 8.7 kpc. From 
this, we derive (/, V - /)o,rc = (14.45,1.08) + (0.10,0.05), 
so that the dereddened source color and magnitude are given 
by: (/, y - /)o = A(/, V -/) + (/, V - I)o.rc = (18.71,0.82). 
From (V - I)q we d erive {V - K)() = 1.78 + 0.14 using the 



Bessel & Bretti (119881) color-color relations. 

The color determines the relatio n between deredden ed 
source flux and angular source radius, (IKervella et al I I2OO4I) 



log 20. = 0.5170 - 0.2yo + 0.2755(y - K)o 



(3) 



giving 9t - 0.63 ± 0.06 yuas. With the angular size of the 
source given by the limb-darkened extended-source fit (model 
5, see Table 1), p. = 0.00202 + 0.00003, we derive the angular 
Einstein radius 6e '■ 0e = 0,/pt = 0.31 ± 0.03 mas. 



4. Event modeling 

4.1. Overview 

The modeling proceeds in several stages. We first give an 
overview of these stages and then consider them each in de- 
tail. First, inspection of the lightcurve shows that the source 
crossed over two "prongs" of a caustic, or possibly two sepa- 
rate caustics, with a pronounced trough in between. The source 
spent 1-3 days crossing each prong and 7 days between prongs. 
This pattern strongly implies that the event topology is that of 
a source crossing the "back end" of a resonant caustic with 
s < 1, as illustrated in Figur^H We nevertheless conducted a 
blind search of parameter space, incorporating the minimal 6 
standard static -binary parameters required to describe all bi- 
nary events, as well as p = Ot/dE, the source size in units of the 
Einstein radius. The parameters derived from this fit are quite 
robust. However, they yield only the planet-star mass ratio q, 
but not the planet mass nip - qM, where M is the host mass. In 
principle, one can measure M from (e.g. iGould.2000.) 



M : 



KKe 



(4) 



where jte is the "microlens parallax" and k = 4G/(c^ AU) ~ 
8.1 masMg'. However, while 0e = Ot/p is also quite robustly 
determined from the static solution (and Section |3), tte is not. 

However, the event timescale is moderately long (~ 
40 days). This would not normally be long enough 
to measure the full microlens parallax, but might be 
enough to measure one dimension of the parallax vector 
dGould. Miralda-Escude & Bahcall Moreover, the large 

separation in time of the caustic feat ures could permit d etec- 
tion of orbital motion effects as well (lAlbrow et al. I l2000h . We 
therefore incorporate these two effects, first separately and then 
together We find that each is separately detected with high sig- 
nificance, but that when combined they are partially degenerate 
with each other In particular, one of the two components of the 
microlensing parallax vector tte is highly degenerate with one 
of the two measurable parameters of orbital motion. It is often 
the case that one or both components of tte are poorly mea- 
sured in planetary microlensing events. The usual solution is to 
adopt Bayesian priors for the lens-source relative parallax and 
proper motion, based on a Galactic model. We also pursue this 
approach, but in addition we consider separately Bayesian pri- 
ors on the orbital parameters as well. We show that the results 
obtained by employing either set of priors separately are con- 
sistent with each other, and we therefore combine both sets of 
priors. 
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Fig. 1. Top: Light curve of MOA-2009-BLG-387 near its peak in July 2009 and the trajectory of the source across the caustic 
feature on the right. The source is going upward. We show the model with finite-source, parallax and orbital motion effects. 
Middle: Magnitude residuals. Bottom: Zooms of the caustic features of the light curve. 
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Fig. 2. Instrumental color-magnitude diagram of the field around MOA-2009-BLG-387. The clump centroid is shown by an open 
circle, while the CTIO / and V - I measurements of the source are shown by a filled circle 
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4.2. Static binary 



4.3. Parallax effects 



A static binary-lens point-source model involves six microlens- 
ing parameters: three related to the lens-source kinematics (fo, 
Mo, ffi), where fo is the time of lens-source closest approach, uq 
is the impact parameter with respect to the center of mass of the 
binary-lens system and Ie is the Einstein timescale of the event, 
and three related to the binary-lens system (q, s, a), where q 
and s are the planet-star mass ratio and separation in units of 
Einstein radius, respectively, and a is the angle between the 
trajectory of the source and the star-planet axis. For n - 7 ob- 
servatories, there are 2n photometric parameters, n x (Fs, Ft,), 
which correspond to the source flux and blend flux for each 
data set. These are usually determined by linear regression. The 
radius of the source, p, in Einstein units, can also be derived 
from the model provided that the source passes over, or sufli- 
ciently close to, a caustic structure. To optimize the fit in terms 
of computing time, we adopt different methods for implement- 
ing finite-source effects, depending on the distance between 
the source and the caustic features in the sky plane. When the 
source is far from the caustic (in the wings of the lightcurve), 
we treat it as a point source. In the caustic crossing regions, 
we use a finite-source model based on the Green-Stokes the- 



orem dGould & Gaucherel . 1997). Numerical implementation 
of this m ethod is adapted from the code that was originally de - 
vised for lAlbrowet al.l (iQOl) and refined in lAnet al.l jZOOlh . 
This technique, which reduces the 2-dimensional integral over 
the source to a 1 -dimensional integral over its boundary and 
so is extremely efficient, implicitly assumes that the source 
has uniform surface brightness, i.e., is not limb darkened. 
We then include limb-darkening in the final fit, as described 
in Section 14.61 Lastly, in the intermedi ary regions, we use 
the hexadeca pole approximation ( Peicha & Hevrovskv I l2009t 
Gould Il2008h . which consists of calculating the magnification 
of 1 3 points distributed over the source in a characteristic pat- 
tern. To fit the microlensing parameters, we perform a Markov 
Chain Monte Carl o (MCMC) fitting with an adaptive step-siz e 
Gaussian sampler JPoran & Muller II2004I: bong et al."2009a). 



After every 200 links in the chain, the covariance matrix be- 
tween the MCMC parameters is calculated again. We proceed 
to five runs corresponding to five different configurations: with- 
out either parallax or orbital motion, with parallax only, with 
orbital motion only, with both effects, and finally with both ef- 
fects and limb-darkening effects included. The results are pre- 
sented in Sectionl4!7] 



The static binary search without parallax leads to the fol- 
lowing parameters: q = 0.0107, s = 0.9152, p = 0.00149, and 
then 6e - 0.42 mas, implying 



Mn^ei = — - 22M0yuas 

K 



(5) 



This product is consistent, for example, with a 1 Mq mass host 
in the Galactic bulge or a 0.025 Mq mass brown-dwarf star at 
1 kpc, either of which would have very important implications 
for the nature of the q = 0.0107 planet. We therefore first in- 
vestigate whether the microlens parallax can be measured. 



When observing a microlensing event, the resulting flux for 
each observatory-filter / can be expressed as. 



Fi(t) = F,jA[u(t)] + Fhj, 



(6) 



where ,■ is the flux of the unmagnified source, Fbj is the back- 
ground flux and u{t) is the source-lens projected separation in 
the lens plane. The source-lens projected separation in the lens 
plane, u{t) of Eq. (|6]l, can be expressed as a combination of two 
components, T(f) and /3{t), its projections along the direction of 
lens-source motion and perpendicular to it, respectively: 



(7) 



If the motion of the source, lens and observer can all be con- 
sidered rectilinear, the two components of u{ f) are given by 



T(f) 



Pit) = MO- 



(8) 



To i ntroduce parallax effects, we use the geocentric for- 
malism IXn et al.l I2OO2I: iGouldl 120041) which ensures that the 
three standard microlensing parameters (fo, fE, mo) are nearly 
the same as for the no-parallax fit. Hence, two more parame- 
ters are fitted in the MCMC code, i.e., the two components of 
the parallax vector, n^, whose magnitude gives the projected 
Einstein radius, - AU/tte and whose direction is that of 
lens-source relative motion. The parallax effects imply addi- 
tional terms in the Eq. ([8]l 



f - fo 

T(t) ^ + dTit) 

IE 

where 



/3(t) ^uo + Spit) 



(6T(t), 613(1)) = tteA^o = (^E-Apo, ^e x Apo) 



(9) 



(10) 



and Apo is the apparent position of the Sun relative to what it 
would have been assuming rectilinear motion of the Earth. 

The configuration with parallax effects corresponds to 
Model 2 of Table 1 , The resulting diagram showing the north 
and east components of tte is presented in Figure |3] Taking 
the parallax effect into account substantially improves the fit 
(A;^'^ - -52). The best fit allowing only for parallax is tie - 
(-1.38, 0.60). There is a hard 3cr lower limit tte > 0.6 and a 3cr 
upper limit tte < 1.9. If taken at face value, these results would 
imply 0.025 < M/Mq < 0.075, i.e., a brown dwarf host with a 
gas giant planet. However, as can be seen from Figure |3] these 
results are inconsistent with the results from Model 4, which 
takes account of both parallax and orbital motion. This incon- 
sistency reflects an incorrect assumption in Model 2, namely 
that the planet is not moving. 

4.4. Orbital motion effects 

For the planet or bital motion, we use the formalism of 



Dong et alJ (l2009ah . The lightcurve is capable of constraining 
at most two additional orbital parameters that can be interpreted 
as the instantaneous velocity components in the plane of the 
sky. They are implemented via two new MCMC parameters 
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Fig. 3. The contours at 1, 2, 3, and 4 cr in black, red, orange, 
and green, respectively. As a comparison, the gray points show 
the approximate 3cr region of Model 4, i.e., with both parallax 
and orbital motion effects, with the Icr contour shown in black. 
The black cross shows the (0,0) coordinates. 



Fig. 4. Orbital parameters of solutions at 1, 2, 3, and 4cr in 
black, red, orange, and green, respectively. As a comparison, 
the gray points show the 3cr region of Model 4, i.e., with both 
parallax and orbital motion effects, with the Icr contour shown 
in black. 



dsldt and which are the uniform expansion rate in binary 
separation s and the binary rotation rate a. 



s — sq + dsldt (t - to) a - cq + a)(t - to). 



(11) 



These two effects induce variations in the shape and ori- 
entation of the resonant caustic, respectively. To ensure that 
the resulting orbital characteristics are physically plausible, we 
can verify for any trial solution that the projected velocity of 
the planet is not greater than the escape velocity of the sys- 
tem, Vj^ < Vesc for a given assumed mass and distance, where 
dPong et al.ll2009ah 



= ^Jids/dt)^ + ia>s)^D,eE 



and 



2GM 



2GM 



- s0eDi. 



(12) 



(13) 



The configuration with only orbital motion corresponds to 
the Model 3 of Table 1 . The resulting diagram showing the so- 
lution for the two orbital parameters co and ds/dt is presented 
in Figure|4] Taking the orbital motion of the planet into account 
substantially improves the fit (Ax^ - -67.5). 

4.5. Combined parallax and orbital motion 

In this section we model both parallax and orbital motion 
effects, which is called Model 4 in Table 1 . Taking these two 
effects into account results in only a modest improvement in 
compared to the cases for which the effects are considered 
individually Cr^ , ,^ _ ^2^^^^^^^ ^ _gy xhe triangle diagram pre- 
sented in Figure |5] shows the 2-parameter contours between 



the four MCMC parameters tte.n, ^e,e, and ds/dt intro- 
duced in Sections 14.31 and [4.41 The best fit is (tte.n^^e.e) - 
(2.495, -0.311) and (w,t/.?/c/f) = (-0.738, -0.360). This would 
lead to a host star of 0.015 Mq at a distance Di = 1.11 kpc and 
a 0.21 Jupiter mass planet with a projected separation of 0.32 
AU. 

This small improvement in x^ can be explained by a de- 
generacy between the north component of tte and the orbital 
parameter w, as shown in Figure |5] In fact, the actual degen- 
eracy is between tte.j. and w, where jte,± (described bv iGould 



20041) is the component of ^Te that is perpendicular to the in- 



stantaneous direction of the Earth's acceleration, i.e., that of 
the Sun projected on the plane of the sky at the peak of the 
event. This acceleration direction is <p - 257.4° (north through 
east). Hence, the perpendicular direction is - 90° = 192.6°, 
which is quite close to the 195.7° degeneracy direction in the 
tte.n and tte.e diagram. Since tte.x is very close (only 13°) from 
north, tte.n is a good approximation for it. 

Indeed, tte.h generates an asymmetry in the lightcurve be- 
cause, to the extent that the source-lens motion is in the direc- 
tion of the Sun-Earth axis, the event rises faster than it falls 
(or vice versa). This effect is relatively easy to detect. But to 
the extent that the motion is perpendicular to this axis, the 
Sun's acceleration induces a parabolic deviation in the trajec- 
tory. To lowest order, this produces exactly the same effect as 
rotation of the lens geometry (which is a circular deviation). 
Hence, the degeneracy between nE,± and o) can only be broken 
at higher order. Th is degeneracy was discussed in the contex t 
of point lenses in Gould. Miralda-Escude & Bahcall I ( 1994 ). 
Smith. Mao & Paczvnskil (l2003ah . and lOouldl (l2004 . In the 



point-lens case, the tte.^ degeneracy appears nakedly (because 
the lens system is invariant under rotation). In the present case, 
the rotational symmetry is broken. In case orbital motion is ig- 



8 



V. Batista etal: MOA-2009-BLG-387 



nored, it thus may appear that parallax is measu red more eas- 
ily in binary events, as originally suggested bv lAn & Gould 
(1200 ih . But in fact, as shown in the present case, once the caus- 
tic is allowed to "rotate" (lowest order representation of orbital 
motion), then the tte.j. degeneracy is restored. 
4.6. Limb-darkening implementation 

Most of the calculations in this paper are done using Stokes' 
theorem, which greatly speeds up the computations by reduc- 
ing a 2-dimensional integral to one dimension. However, this 
method implicitly assumes that the source has uniform surface 
brightness, whereas real sources are limb darkened. In the lin- 
ear approximation, the normalized surface brightness can be 
written 



w(z; r) = 1 - r 1 



VT 



(14) 



where F is the limb-darkening coefficient depending on the 
considered wavelength, and z is the position on the source di- 
vided by the source radius. 

We adopt this approach because we expect that the solu- 
tions with and without limb darkening will be nearly identical, 
except that the uniform source should appear smaller by ap- 
proximately a factor 



(15) 



because this ratio preserves the rms radial distribution of light. 

To test this conjecture, we approximate the surface as a 
set of 20 equal-area rings, with the magnification of each 
ring still computed by Stokes' method. The surface bright- 
ness of the ith ring is simply W(zi) where z, is the mid- 
dle of the ring. The limb-darkening coefficients for the unfil- 
tered data have been determined by interpolation, from V, R, 
I and H limb-darkening coefficients. We find from the CMD 
that the source star has {V - I)o - 0.82, so roughly a G7 
dwarf or slightly cooler We adopt a temperature of T = 
5500 K. We thus obtain the following limb-darkening param- 
eters (My, Mr, m/, uh) ^ (0.7117 0.6353, 0.5507, 0.3659), where 
M = 3F/(F + 2) jAfonso et al.l l2000V Then (Fy, Fr, F/, F^) = 
(0.6220, 0.5373, 0.4497, 0.2778). For a given observatory/filter 
(or possibly unfiltered), we then compare (Robserved - Ictio) 
to {VcTio - IcTio)^ considering that Ictio - O.OVV + 
0.93/ and that approximately V - 2R - I and de- 
duce empirical expression for the corresponding F coef- 
ficients. The F coefficients for all the observatories then 

become (TmOA, ^SAAO^ ^FCO^ ^AO, ^Danish, Fb, 

-onherg-> 

(0.493,0.45,0.52,0.51,0.45,0.53,0.49). Substituting, a mean 
F ~ 0.47 into Eq. ( fTSl l. we expect p to be ~ 5% larger when 
limb-darkening is included. 



with parallax effects only; Model 3: Finite-source binary-lens 
model with orbital motion effects only; Model 4: Finite-source 
binary-lens model with both parallax and orbital motion ef- 
fects; and Model 5: Finite-source binary-lens model with both 
parallax and orbital motion effects and limb-darkening. 

Note in particular that Models 4 and 5 agree within ~ 1 cr 
for all parameters, except that p is ~ 7% greater in the limb- 
darkened case (Model 5). 

5. Bayesian analysis 

The Markov Chain used to find the solutions illustrated in 
Figure |5] is constructed (as usual) by taking trial steps that are 
uniform in the MCMC variables, including fo, mq, and fn- This 
amounts to assuming a uniform prior in each of these variables. 
In the case of the three variables fo, mq, and the solution is 
extremely well constrained, so it makes hardly any difference 
which prior is assumed. Whenever this is the case, Bayesian 
and frequentist orientations lead to essentially the same results. 
However, as shown in Figure |5] is quite poorly constrained: 
at the 2cr level, the magnitude of tte varies by more than an 
order of magnitude. Since the lens distance is related to the mi- 
crolens parallax by Di - AU/(^7rE + ^s), where - AU/Ds, 
this amounts to giving equal prior weight to a tiny range of dis- 
tances nearby and a huge range of distances far away. But the 
actual weighting should have the reverse sign, primarily be- 
cause a fixed distance range corresponds to far more volume 
at large than small distances. In fact, a Galactic model should 
be used to predict the a priori expected rate of microlensing 
events, which depends not only on the correct volume element 
but also on the density and velocity distributions of the lens and 
the source as well. 

Similarly, a Keplerian orbit can be equally well character- 
ized by specifying the seven standard Kepler parameters or 
six phase-space coordinates at a given instant of time, plus 
the host mass. The latter parametrization is more convenient 
from a microlensing perspective because microlensing most 
robustly measures the two in-sky-plane Cartesian spatial co- 
ordinates (s cos a and s sin a) and the two in-plane Cartesian 
velocity coordinates (ds/dt and so)), while the mass is directly 
given by microlens variables M = Oe/ktte- However, the for- 
mer (Kepler) variables have simple well-established priors. By 
stepping equally in microlens parameters, one is effectively as- 
suming uniform priors in these variables, whereas one should 
establish the priors according to the Kepler parameters. 

In principle, one would simultaneously incorporate both 
sets of priors (Galactic and Kepler), and we do ultimately adopt 
this approach. However, it is instructive to first apply them sep- 
arately to determine whether these two sets of priors are basi- 
cally compatible or are relatively inconsistent. 



4. 7. Results summary 

We summarize the best-fit results for the five different mod- 
els presented in Section|4]in Table 1 . The five models are Model 
1 : Finite-source binary-lens model with neither parallax nor or- 
bital motion effects; Model 2: Finite-source binary-lens model 
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Fig. 5. Parallax and orbital motion parameters of solutions contours at 1, 2, 3, and 4cr. The black crosses show the (0,0) 
coordinates. 



Table 1. Fit parameters for finite- source binary-lens models 



Model 


to 


«o 


ts 


s 


1 


a 


P 








ds/dt 












Error bars 












Model 1 


5042.34 


0.0683 


48.7 


0.9152 


0.01073 


4.3074 


0.00149 










1100 


0.01 


0.0005 


0.4 


0.0002 


0.00015 


0.0025 


0.00002 










Model 2 


5042.38 


0.0770 


43.9 


0.9137 


0.01230 


4.3063 


0.00174 


-1.38 


0.60 






1048 


0.02 


0.0015 


0.5 


0.0004 


0.00030 


0.0030 


0.00005 


0.25 


0.07 






Model 3 


5042.32 


0.0902 


38.4 


0.9137 


0.0135 


4.302 


0.00197 






-0.252 


-0.409 


1032.5 


0.02 


0.002 


0.6 


0.0003 


0.0002 


0.002 


0.00005 






0.1 


0.04 


Model 4 


5042.366 


0.0890 


40.1 


0.9134 


0.0135 


4.3095 


0.00195 


2.5 


-0.31 


-0.74 


-0.36 


1024.5 


0.015 


0.0010 


0.5 


0.0002 


0.0002 


0.0025 


0.00003 


1 


0.3 


0.2 


0.05 


Model 5 


5042.36 


0.0881 


40.0 


0.9136 


0.0132 


4.3099 


0.00202 


1.7 


-0.15 


-0.51 


-0.37 


1029.2 


0.02 


0.0010 


0.5 


0.0003 


0.0002 


0.0025 


0.00003 


1 


0.5 


0.3 


0.05 
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Formally, we can evaluate the posterior distribution f(X | 
D), including both prior expectations from (Galactic and/or 
Keplerian) models and posterior observational data using 
Bayes' Theorem : 



f( X I D) 



f(D I X)f(X) 
f(D) 



(16) 



Here f(D \ X) is the likelihood function over the data D for 
a given model X, f{X) is the prior distribution containing all 
ex ante information about the parameters X available before 
observing the data, and f{D) = ^^f{D \ X)f{X)dX. In the 
present context, this standard Bayes formula is interpreted as 
follows: the density of links on the MCMC chain directly gives 
f(D I X), while fiX) encapsulates the parameter priors, in- 
cluding both the underlying rate of events in a "natural physi- 
cal coordinate system" in which these priors assume a simple 
form and the Jacobian of the transformation from this "phys- 
ical" system to the "natural microlensing parameters" that are 
directly modeled in the lightcurve analysis. 

It is not obvious, but we find below that the coordinate 
transformations for Galactic and Kepler models actually fac- 
tor, so we can consider them independently. 

5.1. Galactic model 

Applying the generic rate formula F = ncrv to microlens- 
ing rates as a function of the independent physical variables 
(M,Di,fi), yields 



c/^F 



dDi dM d^fi 



v(x,y,z)(2REMUx)g(M), (17) 



where the spatial positions (x, y, z), the physical Einstein radius 
Re, and the lens velocity relative to the observer-source line 
of sight Viei are all regarded as dependent variables of the four 
variables shown on the l.h.s., plus the two angular coordinates. 
Here v(x, y, z) is the local density of lenses, g{M) is the mass 
function [we will eventually adopt g{M) oc M"'], and /(/i) is 
the two-dimensional probability function for a given source- 
lens relative proper motion, //. Since v,ei = fJ^Di and Re = Di9e, 
this can be rewritten in terms of microlensing variables. 



t/^F 



d'^r 



_1 

dtE dBE d^TTE dDi dM d'^ji tte 



d{DL,M,n) 



2DteEiJvix,y,z)mg(M) x —D 



d(tE, 0E, 7!"e) 



where M = Oe/ktte, Di - AU/(;ri-ei -i- /r^), Tirei = Oe^e, and 
fi = ^e/^e are now regarded as dependent variables. We note 
that 

d(DL, M,ti) 5(7r,ei, M,ju) dD^ 2;r,-ei M fi DI 



d{ tE,9E,7rE) d{tE,0E,7TE) dnrei ^e TTe AU ' 

where the last evaluation follows from the general theorem: 



d(yd 

d{xj) d{\nxj)WjXj ' 'WjXj' 



d{\nyi) Y\iyi 11/ J; 

- \a\- 



Finally, Eq. (fTTl i reduces to 

d^T 4 

,2 = —v{x,y,z)m[g{M)M]^. 
dtE dOE a tte AU tie 



(18) 



The variables on the l.h.s. of Eq. (fTST l are essentially the 
Markov chain variables in the microlensing fit procedure, [j The 
distribution of MCMC links applied to the data can be thought 
of as the posterior probability distribution of the Markov- chain 
variables under the assumption that the prior probability dis- 
tribution in these variables is uniform. In our case, the prior 
distribution is not uniform, but is instead given by the rh.s. of 
Eq. dull. We therefore must weight the output of the MCMC 
by this quantity, which is the specific evaluation of f(X) in 
Eqs. ([T6]l and (fTTl i. 

As mentioned above, we adopt g{M) oc M"', so the term in 
square brackets disappears. We evaluate v{x,y,z) and fip) as 
follows. 



5.1 .1 . Lens-source relative proper motion distribution 

m 

To compute the relative proper motion probability, we assume 
that the velocity distributions of the lenses and sources are 
Gaussian /(v,,, v,) - /(vy)/(v,) where 



/(My) = /(v>0 



dVy 
dHy 



D, 



1 



exp 



(Vy - Vy)2 



(19) 



and a similar distribution for fifiz)- Here v,. and v- are compo- 
nents of the projected velocity v derived from the MCMC fit, 
which is expressed by v = fiDi, where 



M = — — • 

IE 



(20) 



The expected projected velocity which appears in Eq[T9]is de- 
fined as 



V -Vi- 



D, D,, 

Vs + Vo 



(21) 



where Di, Dj are respectively the lens and source distances 
from the observer and D/, the lens-source distance. The ve- 
locity is expressed in the (x, y, z) coordinate system, centered 
on the center of the Galaxy, where x and z axes point to the 
Earth and the North Galactic pole, respectively. As given in 



Han & Gould 



(Il995h 

-1 



we adopt v^^dhk 



VzMige = and 

o'z.disk - 20km. s"', (Trjjuige - 100km. s"' for the z compo- 
nent of the velocity. For the y direction, Vy^disk - 220km.s~', 
VyMdge = unAa-y^disk = 30km.s"', a-y^huige = lOOkm.s"' de- 
pending on whether the lens is situated in the disk or in the 
bulge. We also consider the asymmetric drift of the disk stars 
by subtracting lOkm.s"' from Vyjisk- The celestial north and 
east velocities of the Earth seen by the Sun at the time of the 
event are Ve - (ve.e,ve,n) = (-1-22.95, -3.60) km. s"'. In the 



' In fact, p is used in place of 9e, but this makes no difference, since 
6»E oc p). 
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Galactic frame, the galactic north and east components of the 
Earth velocity become 



VE,NorthGai - ve,n COS 59.7° - x'e.e sin59.7°, 

VE,EastGal = V£_iv sin 59.7° + V'£,£ COS 59.7°. 



(22) 
(23) 



The velocity of the Sun in the Galactic frame is Vq = 
(7, 12)km.s"' + (0, Vciic), where Vcirc = 220 km. s"', from which 
we deduce the velocity Vg of the observer in the Galactic frame 
by adding the Earth velocity from Eq. (l22l l. 

5.1.2. Density distribution v{x,y,z) 

The density distribution, v(x, y, z), is given at the lens coordi- 
nates (x,y,z) in the Galactic frarne. For this distribution, we 
adopt the model of lHan & Gould I (I2003I) . which is based pri- 
marily on star counts, and, without any adjustment, reproduces 
the microlensing optical depth measured toward Baade's win- 
dow. The density models are given in Table 15.3! The disk pa- 
rameters are H = 2.75 kpc, hi = 156 pc, h2 - 439 pc, and 
= 0.381, where R = {x^ + y^)'^^. For the barred (anisotropic) 

bulge model, = {[(x'/xof + (y' /yof? + (z'lzaf)"^- Here 
the coordinates ix' ,y' ,z') have their center at the Galactic cen- 
ter, the longest axis is the x' , which is rotated 20° from the 
Sun-GC axis toward positive longitude, and the shortest axis 
is the z' axis. The values of the scale lengths are xo - 1.58 
kpc, yo = 0.62 kpc an d zg - 0.43 kpc respectively. For 
the bulge, iHan & Gouldl (l2003h norn iaUze the "G2" K- band 
integrated-light-based bar model of Dwek et a P (1 199 51) us- 
ing sta r co unts toward Baade's window from holtzma n et al. 



(119981) and lZoccali et a l. (2000). For the disk, they incorporate 
the model of lZheng et al. (JOOl), which is a fit to star counts. 

In the calculation, we sum the probabilities of disk and 
bulge locations for the lens. We set the limits of the disk range 
to be [0, 7] kpc from us and [5,11] kpc for the bulge range. We 
also apply the bulge density distribution to the source, in the 
[6.5, 11] kpc range. Rigorously, because we already know the 
dereddened flux of the source, we should have derived a distri- 
bution of sources from the luminosity distribution of bulge stars 
combined with their distance. However, as we do not know 
the precise distribution of bulge luminosities at fixed color, we 
only consider the density distribution of sources as a function 
of their position in the bulge only. Because the stellar density 
drops off very rapidly from the peak, the source is effectively 
localized as being close to the Galactocentric distance. 

5.2. Orbital motion model 

In addition to the Galactic model, we build a Keplerian model 
to put priors on the orbital motion of the planet. To extract the 
orbital parameters fr om the microlensing parameters, we re- 
fer to the appendix of iDong et al.l (l2009al) . Given that from the 
light curve of the event we have access to the instantaneous 
projected velocity and position of the planet for only a short 
time, we consider a circular orbit to model the planet motion. 
The distortions of the light curve are modeled by o) and ds/dt, 
which then specify the variations in orientation and shape of 



the resonant caustic, respectively. These quantities are defined 
in Section 14.4! Since rj_ = Didgd is the projected star-planet 
separation, we evaluate the instantaneous planet velocity in the 
sky plane, with r±y^ = r±a) the velocity perpendicular to the 
planet-star axis and r^y^ = r±{ds/dt)/s the velocity parallel 
to this axis. We define the /, j, k directions as the instantaneous 
star-planet axis on the sky plane, the direction into the sky, and 
k - i X j. In this frame, the planet is moving among two di- 
rections, defined by the angles 6 and (p, which are effectively 
a (complement to a) polar angle and an azimuthal angle, re- 
spectively. Specifically, (p is the angle between the star-planet- 
observer (r^ = a sin0), and 9 characterizes the motion in the 
direction of the velocity along k. Then the instantaneous veloc- 
ity of the planet is 



GM , 



V - -i [cos 6k + sin 0(cos (pi - sin d> j)] 

a 



(24) 



where a is the semimajor axis. Thus we obtain = 

and - sin cot 0. The Jacobian expression to trans- 

form from P{s, y^, yy) to P(a, (p, 9) is 



J = 



d{a, (p,9) cr' j /I .9 9 
= tan^ d>{- - sm^ 9tan^ i 



(25) 



As explained in iDong et al.l ( l2009ah . for one set of mi- 
crolensing parameters, there are two degenerate solutions in 
physical space. In the orbital model, we consider the two solu- 
tions to constrain the light curve fit, each with its own separate 
probability. 

From the definition of the two angles, the transformation 
of the polar system (fl,7r/2 - d,(p) contains the quantity sinO 
and so the Jacobian includes the factor cos 9 from ii(sin 9)d(p - 
d9d(p cos 9. Moreover, we adopt a flat distribution on In(fl), im- 
plying the factor 1/a in the Jacobian expression. Then, 

5(ln(fl), 0, sinff) cos0/l 9 9 \-i 

5(i,r±.ri|) GMcos2 0V2 

Note that the terms sin 9 and cos 9 in the denominators of 
Eq. 



correct an error in Dong et al.l (12009a I 



5.3. Constraints from VLT 



As foreshadowed in Section l27Tl the VLT NACO flux measure- 
ment places upper limits on the flux from the lens, hence on 
its mass (assuming it is not a white dwarf). However, we be- 
gin by assuming that the excess light is caused by the lens. We 
do so for two reasons. First, this is actually the most precise 
way to enforce an upper limit on the lens flux. Second, it is of 
some interest to see what mass range is "picked out" by this 
measurement, assuming the excess flux is due to the lens. 

The first point to note is that, if the lens contributes any 
significant flux, then it lies behind most or all of the dust seen 
toward the source. For example, if the lens mass is just M = 
0.15 M0 (which would make it quite dim, Mh > 8), then it 
would lie at distance Dl = AU/(91/kM + AU/Ds) = 4.9 kpc, 
where we have adopted the central values 9e = 0.3 1 mas and 
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Ds - 8.7 kpc for this exercise. More massive lenses would be 
farther. 

Next we estimate Ah - 0.4 from the measured clump color 
(V - /)ci - 2.10, assuming an intri nsic color of the red giant 
clump of (V - /)o,ci = 108 (iBensbv etliLiiloiOl) and adopting 
for this Hne of sight Ah/E(V - /) = 0.40. 

Finally, for the relation between M and M h, we consult th e 
library of empirically-calibrated isochrones of lAn et al.ld2007h . 
We adopt the oldest isochrones available (4 Gyr), since there is 
virtually no evolution after this age for the mass range that will 
prove to be of interest M < 0.7 Mq. Moreover, in this mass 
range, the isochrones hardly depend on metallicity within the 
range explored (-0.3 < [Fe/H] < +0.2). 
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Table 2. Density distribution for the bulge and disk models 



Location 


Model 




Distribution (in Mq pc ■*) 


Bulge 


Dwek 




1.23exp(-0.5r2) 


Disk 


Zheng 


y{R, z) 


= 1.07 exp(-i?///)[(l -p){-\z\lh,) +/3exp(-z//i2)] 




_2 -1.5 -1 -.5 -2 -1.5 -1 -.5 




log (M,„3/M3.J log (M,„3/M3.n) 

Fig. 6. Bayesian analysis results. Each panel shows host mass M versus lens-source relative parallax n^^u with 1, 2, 3, and 4cr 
contours under two different conditions. The solid black contours are derived from the light curve alone, without any priors. The 
colored symbols show contour levels after applying various priors, respectively Galactic proper motion only, Kepler only, full 
Galactic and Kepler priors, and full Galactic and Kepler priors, plus VLT imaging constraints. The proper-motion and Kepler 
priors are fully consistent with the light curve, but there is strong tension between between the distance-related priors and the 
lightcurve, with the former favoring high masses and small lens-source separations. The highest part of this disputed mass range, 
M > 0.7 Mq, is essentially ruled out by the VLT imaging constraint (lower right). 



14 



V. Batista etal: MOA-2009-BLG-387 



For each mass and distance considered below, we then cal- 
culate Hl - Mh+Ah + 5 log(D£/10pc) and combine the corre- 
sponding flux with Hs = 18.35 to obtain i/pied- We then calcu- 
late a likelihood factor L/f = exp[-(//pied-//obs)^/2cr^], where 
//obs = 18.25 and (Th = 0.07, as discussed in Section l2~n 

For fiducial values Ds - 8.7 kpc and Be = 0.31 mas, this 
likelihood peaks at M = 0.42 Mq, but it does so very gently. 
The suppression factor is just Lfj ~ 0.7 at M = 0.21 Mq and 
M = 0.52 Mq. At lower masses, even if there were zero flux, 
the suppression would never get lower than Lh - 0.36, simply 
because the excess-flux measurement is consistent with zero at 
1 .4 cr. But at higher mass, the expected flux quickly becomes 
inconsistent. For example, Lh(0.65 Mq) - 0.07. 

Hence, by treating the flux measurement as an excess-flux 
"detection", we impose the "upper limit" on mass in a graceful 
manner. Moreover, as regards the upper limit, this approach re- 
mains valid when we relax the assumption that the excess flux 
is solely due to the lens. That is, even if there are other contribu- 
tors, the likelihood of a given high-mass lens being compatible 
with the flux measurement can only go down. 

However, the same reasoning does not apply at the low- 
mass limit. For example, if the excess flux came from a source 
companion or an ambient star, then a brown-dwarf lens would 
be fully compatible with the flux measurement. Nevertheless, 
this is quite a minor effect because, in any event, the suppres- 
sion factor would not fall below 0.36. To account for other po- 
tential sources of light, we impose a minimum suppression fac- 
tor Ln.min = 0.5 at the low-mass end. 

5.4. Combining Galactic and Kepler priors and adding 
VLT constraints 

In this section, we impose the priors from the Galactic and 
Kepler models and add the constraints from the VLT flux mea- 
surement. We defer the VLT constraints to the end because they 
do not apply to the special case of white-dwarf lenses. 

We begin by examining the role of the various priors sepa- 
rately to determine the level of "tension" between these and the 

derived from the light curve alone. We do so because each 
prior involves dififerent physical assumptions, and tension with 
the Ught curve may reveal shortcomings in these assumptions. 

The Kepler priors involve two assumptions, first that the 
planetary system is viewed at a random orientation (which is 
almost certainly correct) and second that the orbit is circular 
(which is almost certainly not correct). We will argue further 
below that the assumption of circular orbits has a modest im- 
pact. In any event, we want to implement the Kepler priors by 
themselves. 

The Galactic priors really involve two sets of assumptions. 
The more sweeping assumption is that planetary systems are 
distributed with the same physical-location distribution and 
host-mass distribution as are stars in the Galaxy. We really 
have no idea whether this assumption is true or not. For ex- 
ample, it could be that bulge stars do not host planets. The 
assumptions about host mass and physical location are linked 
extremely strongly in a mathematical sense (even if they prove 
to be unrelated physically) because 6e is well-measured, and 



6^ = kM TTi-si. Thus, we must be cautious about this entire set of 
assumptions. 

However, the Galactic priors also contain another factor 
f(ji), in which we can have greater a priori confidence. This 
prior basically assumes that planetary systems at a given dis- 
tance (regardless of how common they are at that distance) will 
have similar kinematics to the general stellar population at the 
same distance. The scenarios in which this assumption would 
be strongly violated, while not impossible, are fairly extreme. 

Therefore we begin by imposing proper-motion-only and 
Kepler-only priors in the top two panels of Figure |6] which 
plots host mass M versus lens-source relative parallax TTrei. We 
choose to plot TTiei rather than Dl because it is given directly 
by microlensing parameters n^ei - tteOe- The 1, 2, 3, and 4 cr 
contours from the based on the light curve only are shown 
in black. Each of these priors is consistent with the light curve 
at the 1 cr level, so we combine them and find that they stiU dis- 
play good consistency. In the lower left panel, we combine the 
full Galactic and Kepler priors. These tend to favor much heav- 
ier, more distant lenses, which are strongly disfavored by the 
lightcurve, primarily because of the factor D^/7r,ei in Eq. (fTST l. 
Indeed masses M > 0.7 Mq will be effectively ruled out by 
high-resolution VLT imaging, further below. 

When combining Galactic and Kepler priors, we simply 
weight the output of the MCMC by the product of the factors 
corresponding to each. This is appropriate because, while the 
6x6 matrix, transforming the full set of microlensing param- 
eters (s, 'y±,y\\, tE, Oe, tie) to the full set of physical parameters 
{a, (p, 6, M, Di, jj), is not block diagonal, the Jacobian neverthe- 
less factors as 

d{a,(p,e,M,DL,ij) _ d{a,4>,0) d(M,DL,tJ) 
d(s,y_i_,y\\,tE,0E,TrE) d(s,y_i_,y\{) d(tE,0E,7TE)' 

Hence, the full weight, fiX) in Eq. ( fT6b is simply the product 
of the two found separately for the Galactic and orbital priors. 

Figure |7] shows the host-mass probability distribution be- 
fore (red) and after (black) applying the constraint from VLT 
imaging to the previous analysis incorporating both Galactic 
and Kepler priors. The 90% confidence interval is marked. The 
high mass solutions toward the right are strongly disfavored by 
the lightcurve (see Figure |6), but the Galactic prior for them 
is so strong that they have substantial posterior probability. 
However, these solutions are heavily suppressed by the VLT 
flux limits. The hsot is most likely to be an M dwarf. The lower 
right panel of Figure |6]shows the 2-dimensional (M, TTrei) prob- 
ability distribution for direct comparison with the results from 
applying various combinations of priors. 

5.5. Bayesian results for physical parameters 

Table 3 shows the median estimates and 90% confidence inter- 
vals for six physical parameters (plus one physical diagnostic) 
as more priors and constraints are applied. The bottom row, 
which includes full Galactic and Kepler priors, plus constraints 
from VLT photometry shows our adopted results. The six phys- 
ical parameters are the host mass M, the planet mass irip, the 
distance of the system Dl, the period P, the semi-major axis a, 
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Fig. 7. Probability as a function of host mass after applying the 
Galactic and Kepler priors {red) and then adding the constraints 
from VLT observations {black). 

and the orbital inclination /. The last three assume a circular or- 
bit. For rows 2 and 4 (which do not apply Kepler constraints), 
the values shown for {P, a, i) summarize the results restricted to 
links in the chain that are consistent with a circular orbit, while 
the first four columns summarize all links in the chain. The key 
results are 



0.071 Mo < Mhost <0.49Mo 



(90% confidence) 



(27) 



and corresponding to this, nip - qM, where q - 0.00132 + 
0.00002, i.e., 

1 .0 Mjup < nip < 6.7 Mjup (90% confidence), (28) 
3.8 yr < P < 7.6 yr (90% confidence) (29) 

1.1 AU < fl < 2.7 AU (90% confidence) (30) 

with the medians at M = 0.19 Mq, nip - 2.6Mjup, P - 5.4 yr, 
a = 1 .8 AU. That is, the host is an M dwarf with a super- Jovian 
massive planetary companion. For completeness, we note that 
in obtaining these results, we have implicitly assumed that the 
probability of a star having a planet with a given planet-star 
mass ratio q and semi-major axis a is independent of the host 
mass and distance. 



about M ~ 0.6 Mq, which corresponds to an Mprog ~ 2 Mq pro- 
genitor If the progenitor had a planet, it would have increased 
its semi-major axis by a factor a/ainit - Mp^og/M ~ 3.3 as 
the host adiabatically expelled its envelope. We find that, for 
M = 0.6 Mq, the orbital semi-major axis is fairly tightly con- 
strained to a - 2.3+0.3 AU, implying Oinit = 0.7 +0.1 AU. It is 
unlikely that such a close planet would survive the AGB phase 
of stellar evolution. Of course, a white dwarf need not be right 
at the peak. For lower mass progenitors, the ratio of initial to 
final masses is lower, which would enhance the probability of 
survival. But it is also the case that such white dwarfs are rarer 



5.7. Physical consistency checks of bayesian analysis 

The results reported here have been derived with the aid of 
fairly complicated machinery, both in fitting the light curves 
and in transforming from microlensing to physical parame- 
ters. In particular, we have identified a strong mathematical 
degeneracy between the parameters tte.n and oj, which arise 
from orbital motion of the Earth and the planet, respectively. 
When considering "MCMC-only" solutions, this degeneracy 
led to extremely large errors in tte.n in Figure |5] which are 
then reflected in similarly large errors in the "light-curve- 
only" contours for host mass and lens-source relative parallax 
in Figure |6] Nevertheless, these large errors gradually shrink 
when the priors are applied in Figure |6] and more so when the 
constraints from VLT observations are added in Figure Q 

We have emphasized that the high-TTE (so low-D^, low- 
M) solutions are very strongly, and improperly, favored by the 
MCMC when it is cast in microlensing parameters, and that 
the Galactic prior (Eq. [18) properly compensates for this. But 
is this really true? The best-fit distance for the Galactic-prior 
model is four times larger than for the MCMC-only model, 
meaning that the term D^/jirei favors the Galactic model by a 
factor ~ 2500. Thus, even if the light curve strongly favored the 
nearby model, the Galactic prior could "trump" the light curve 
and enforce a larger distance. Indeed, this would be an issue 
if the Galactic prior were operating by itself. In fact, however. 
Figure |6] shows that the finally adopted solution (including the 
VLT flux constraint) is disfavored by the light curve alone by 
just Ax^ ~ 3, so, in the end there is no strong tension. 

A second issue is that both parallax and orbital motion are 
fairly subtle effects that could, in principle, be affected by sys- 
tematics. If this were the case, the principal lensing parameters, 
such as q and s, would remain secure, but most of the "higher 
order" information, such as lens mass, distance, and orbital mo- 
tion would be compromised. It is always difficult to test for 
systematics, particularly in this case for which there are two 
effects that are degenerate with each other and in combination 
are detected at only A^^ < 100. 



5.6. White dwarf host? 

When we applied the VLT flux constraint, we noticed that it 
would not apply to white-dwarf hosts. Is such a host other- 
wise permitted? In principle, the answer is "yes", but as we 
now show, it is rather unlikely. The WD mass function peaks at 
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Table 3. Physical parameters 
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60 
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4.99 
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0.25 


3.55 
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Fig. 8. Physical test of Bayesian results: physicality diagnostic fi - £kin,±/£^pot,± is plotted against host distance. Bound orbits 
must have /? < 1, and we expect a priori 0.1 < y6 < 0.5. 
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However, we can in fact test for such systematics using the 
diagnostic 

P^^-P^, (31) 

where Vj^ and Vesc.± are defined in Eqs. (fT2l i and ( fTST l. Bound 
orbits require /3 < 1. Circular orbits, if seen face-on, have /5 = 
0.5 and otherwise p < 0.5. Of course, it is possible to have 
j6 <«c 1, but it requires very special configurations to achieve 
this. For example, if the planet is close to transiting its host, or 
if the orbit is edge-on and the phase is near quadrature. Thus, 
a clear signature of systematics would be j6 > 1 for all light- 
curve solutions with reasonable And if jS ^ 0.1, one should 
be concerned about systematics, although this condition would 
certainly not be proof of systematics. With these considerations 
in mind, we plot Dl vs. y6 in Figure |8] 

The key point is that the 1 cr region of the Galactic -prior 
panel straddles the region /3 ^ 0.5 (log/? ^ -0.7), which is 
characteristic of approximately circular, approximately face-on 
orbits. It is important to emphasize that no selection or weight- 
ing by orbital characteristics has gone into construction of this 
panel. This is a test which could easily have been failed if the 
orbital parameters were seriously influenced by systematics: 
could have taken literally on any value. 

Finally, we turn to the two righthand panels, which incorpo- 
rate the orbital constraints. Since these assume circular orbits, 
they naturally eliminate all solutions with /3 > 0.5, and some 
smaller-y6 solutions as well, because when ds/dt + 0, it is im- 
possible to accommodate a y6 = 0.5 circular orbit. While this 
radical censoring of the high-j8 solutions is the most dramatic 
aspect of these plots, there is also the very interesting efifect that 
low-y6 solutions are also suppressed (though more gently). This 
is because, as mentioned above, these require special configu- 
rations and so are disfavored by the Kepler Jacobian, Eq. (IZST l. 
Of course, radical censorship of /? > 1 solutions is entirely ap- 
propriate (provided that y6 < 1 solutions exist at reasonable ;t'^), 
but what about 0.5 % fi < 1? A more sophisticated approach 
would permit non-circular orbits and then suppress these solu- 
tions "more gently" using a Jacobian (as is already being for 
done low-y6 solutions). However, as we have emphasized, the 
limited sensitivity of this event to additional orbital parameters 
does not warrant such an approach. Hence, radical truncation 
is a reasonable proxy in the present case for the "gentler" and 
more sophisticated approach. 

Moreover, one can see by comparing Rows 2 and 3 of Table 
3 that the addition of Kepler priors does not markedly alter the 
Galactic -prior solutions. 

6. Conclusions 

We report the discovery of the planetary event MOA- 
2009-BLG-387Lb. The planet/star mass ratio is very well- 
determined, q - 0.0 1 32 +0.0003 . We constrain the host mass to 
he in the interval. 0.07 < M^ost/MQ < 0.49 at 90% confidence, 
which corresponds to the full range of M dwarfs. The planet 
mass therefore lies in the range 1.0 < »Jp/Mjup < 6.7 , with 
its uncertainty almost entirely due to the uncertainty in the host 



mass. The host mass is determined from two "higher-order" 
microlensing parameters, 6e and tte, (i.e., M - 0e/kjte)- 

The first of these, the angular Einstein radius is actually 
quite well measured, 6e - 0.31 + 0.03 mas, from four sepa- 
rate caustic-crossings by the source during the event. On the 
other hand, from the light-curve analysis alone, the microlens- 
ing parallax vector tte is poorly constrained because one of its 
components is degenerate with a parameter describing orbital 
motion of the lens. That is, efifects of the orbital motion of our 
planet (Earth) and the lens planet have a similar impact on the 
light curve and are difficult to disentangle. 

Nevertheless, the closest-lens (and so also lowest-lens- 
mass) solutions permitted by the light curve are strongly dis- 
favored by the Galactic model simply because there are rel- 
atively few extreme-foreground lenses that can reproduce the 
observed light-curve parameters. Of course, we cannot abso- 
lutely rule out the possibility that we are victims of chance, so 
in principle it is possible that the host is an extremely low-mass 
brown dwarf, or even a planet, with a lunar companion. 

On the other hand, the arguments against a higher mass lens 
rest on directly observed features of the light curve. That is, as 
mentioned above, 6e is measured accurately from the four ob- 
served caustic crossings. And one component of tte, the one in 
the projected direction of the Sun, is also reasonably well mea- 
sured from the observed asymmetry in the light curve outside 
the caustic region. This places a lower limit on ke, hence an 
upper limit on the mass. 

However, for the latter parameter, the very strong prior 
from the Galactic model favoring more distant lenses would, 
by itself, "overpower" the lightcurve and impose solutions with 
M > 1 M0, which are disfavored by the lightcurve at > 3 cr. It 
is only because these high-mass solutions are ruled out by flux 
limits from VLT imaging that the lightcurve-only is quite 
compatible with the final, posterior-probability solution. 

The relatively high planet/star mass ratio (implying a 
Jupiter-mass planet for the case of a very late M-dwarf host) 
is then difficult to explain within the context of the standard 
core-accretion paradigm. 

The 12-day duration of the planetary perturbation, one of 
the longest seen for a planetary microlensing event, enabled 
us to detect two components of the orbital motion, basically 
the projected velocity in the plane of the sky perpendicular 
and parallel to the star-planet separation vector While the first 
of these is strongly degenerate with the microlens parallax (as 
mentioned above), the second one (which induces a changing 
shape of the caustic) is reasonably well constrained by the two 
sets of well-separated double caustic crossings. Moreover, once 
the Galactic-model prior constrained the microlensing paral- 
lax, its correlated orbital parameter was implicitly constrained 
as well. With two orbital parameters, plus two position param- 
eters from the basic microlensing fit (projected separation s, 
and orientation of the binary axis relative to the source motion 
a) plus the lens mass, there is enough information to specify 
an orbit, if the orbit is assumed circular. We are thus able to 
estimate a semi-major axis a = 1.8 AU and period 5.4 years. 

We recognized that inferences derived from such subtle 
light curve effects could in principle be compromised by sys- 
tematics. We therefore tested whether the derived ratio of or- 
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bital kinetic to potential energy was in the expected range, be- 
fore imposing any orbital constraints. If the measurements were 
strongly influenced by systematic errors, this ratio could have 
taken on any value. In fact, it fell right in the expected range. 
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